Data parallelism with high performance C
Abstract
This paper describes a preliminary design of the High-Performance C (HPC) language [2]. HPC, a machine-independent language extension to C, allows the user to write programs for distributed-memory systems using global addresses. HPC includes high-level features for specifying processor arrays, for specifying both static and dynamic data distributions across processors, and for explicitly formulating loops with independent iterations. The language is based on the data parallel paradigm, which exploits the parallelism inherent in many scientific applications. This paper focuses on HPC features that go beyond High-Performance Fortran (HPF) in the areas of sequential and data parallel extensions.
Similar resources
High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation
Image segmentation is one of the most common steps in digital image processing. There are many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...
Dynamic Task Parallelism with a GPU Work-Stealing Runtime System
NVIDIA’s Compute Unified Device Architecture (CUDA) and its attached C/C++ based API went a long way towards making GPUs more accessible to mainstream programming. So far, the use of GPUs for high performance computing has been primarily restricted to data parallel applications, and with good reason. The high number of computational cores and high memory bandwidth supported by the device makes ...
CppSs - a C++ Library for Efficient Task Parallelism
We present the C++ library CppSs (C++ superscalar), which provides efficient task-parallelism without the need for special compilers or other software. Any C++ compiler that supports C++11 is sufficient. CppSs features different directionality clauses for defining data dependencies. While the variable argument lists of the taskified functions are evaluated at compile time, the resulting task de...
High Performance Parallel Database Processing and Grid Databases
data parallelism for a decision tree, 489–492; data set structure, 479–480; decision tree algorithm, 480–481; decision tree classification, 477–480; processes, 480–488; structure, 478–479; result parallelism for the decision tree, 492–495